Smart Paper Notes

Towards In-distribution Compatibility in Out-of-distribution Detection

Boxi Wu , Jie Jiang , Haidong Ren , Zifan Du , Wenxiao Wang , Zhifeng Li , Deng Cai , Xiaofei He , Binbin Lin , Wei Liu

Categories: Computer Vision | Machine Learning

2022-08-29

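Despite their remarkable ability to discriminate targeted in-distribution samples, deep neural networks perform poorly at detecting anomalous out-of-distribution data. To address this defect, state-of-the-art solutions train deep networks on an auxiliary dataset of outliers. Various training criteria for these auxiliary outliers have been proposed based on heuristic intuitions. However, we find that these intuitively designed outlier training criteria can hurt in-distribution learning and eventually lead to inferior performance. To this end, we identify three causes of in-distribution incompatibility: contradictory gradients, false likelihood, and distribution shift. Based on this new understanding, we propose a new out-of-distribution detection method by adapting both the top-level design of deep models and the loss function. Our method achieves in-distribution compatibility by reducing interference with the probabilistic characteristics of in-distribution features. On several benchmarks, our method not only achieves state-of-the-art out-of-distribution detection performance but also improves in-distribution accuracy.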

Coordinated Proximal Policy Optimization

Zifan Wu , Chao Yu , Deheng Ye , Junge Zhang , Haiyin Piao , Hankz Hankui Zhuo

Categories: Artificial Intelligence

2021-11-07

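We present Coordinated Proximal Policy Optimization (CoPPO), an algorithm that extends the original Proximal Policy Optimization (PPO) to the multi-agent setting. The key idea lies in the coordinated adaptation of step size during the policy update process among multiple agents. We prove the monotonicity of policy improvement when optimizing a theoretically grounded joint objective, and derive a simplified optimization objective based on a set of approximations. We then show that this objective in CoPPO can achieve dynamic credit assignment among agents, thereby alleviating the high-variance issue during the concurrent update of agent policies. Finally, we demonstrate that CoPPO outperforms several strong baselines and is competitive with the latest multi-agent PPO method (i.e., MAPPO) under typical multi-agent settings, including cooperative matrix games and StarCraft II micromanagement tasks.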
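
For orientation, the sketch below shows the per-sample clipped surrogate of the original single-agent PPO that the abstract says CoPPO extends. It is not CoPPO's joint objective, which the abstract does not spell out. The function name `ppo_clipped_surrogate` and the default `epsilon=0.2` are illustrative assumptions.

```python
def ppo_clipped_surrogate(ratio, advantage, epsilon=0.2):
    """Standard PPO clipped surrogate for one sample (hypothetical helper).

    ratio:     new-policy probability divided by old-policy probability
    advantage: advantage estimate for the sampled action
    epsilon:   clipping range, which bounds the effective step size
    """
    # Clip the probability ratio into [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Take the pessimistic (smaller) of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # A large ratio with a positive advantage is capped by the clip.
    print(ppo_clipped_surrogate(1.5, 2.0))
    # A large ratio with a negative advantage is not capped.
    print(ppo_clipped_surrogate(1.5, -1.0))
```

The clipping range plays the role of a step-size limit on each policy update; the abstract's "coordinated adaptation of step size" concerns how such limits interact across agents.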

DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis

Yinghao Xu , Menglei Chai , Zifan Shi , Sida Peng , Ivan Skorokhodov , Aliaksandr Siarohin , Ceyuan Yang , Yujun Shen , Hsin-Ying Lee , Bolei Zhou

Categories: Computer Vision

2022-12-22

Existing 3D-aware image synthesis approaches mainly focus on generating a single canonical object and show limited capacity in composing a complex scene containing a variety of objects. This work presents DisCoScene: a 3D-aware generative model for high-quality and controllable scene synthesis. The key ingredient of our method is a very abstract object-level representation (i.e., 3D bounding boxes without semantic annotation) as the scene layout prior, which is simple to obtain, general to describe various scene contents, and yet informative to disentangle objects and background. Moreover, it serves as an intuitive user control for scene editing. Based on such a prior, the proposed model spatially disentangles the whole scene into object-centric generative radiance fields by learning on only 2D images with the global-local discrimination. Our model obtains the generation fidelity and editing flexibility of individual objects while being able to efficiently compose objects and the background into a complete scene. We demonstrate state-of-the-art performance on many scene datasets, including the challenging Waymo outdoor dataset. Project page: https://snap-research.github.io/discoscene/

AutoSlicer: Scalable Automated Data Slicing for ML Model Analysis

Zifan Liu , Evan Rosen , Paul Suganthan G. C

Categories: Machine Learning

2022-12-18

Automated slicing aims to identify subsets of evaluation data where a trained model performs anomalously. This is an important problem for machine learning pipelines in production since it plays a key role in model debugging and comparison, as well as the diagnosis of fairness issues. Scalability has become a critical requirement for any automated slicing system due to the large search space of possible slices and the growing scale of data. We present AutoSlicer, a scalable system that searches for problematic slices through distributed metric computation and hypothesis testing. We develop an efficient strategy that reduces the search space through pruning and prioritization. In the experiments, we show that our search strategy finds most of the anomalous slices by inspecting a small portion of the search space.
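
To make the idea of flagging a slice by hypothesis testing concrete, here is a minimal single-machine sketch. It assumes the metric is an error rate and uses a one-sided two-proportion z-test of the slice against the rest of the data. The abstract does not say which test or metric AutoSlicer uses, so the function `slice_p_value`, the example slice names, and the 0.01 threshold are all hypothetical.

```python
import math
from statistics import NormalDist


def slice_p_value(slice_errors, slice_size, total_errors, total_size):
    """One-sided two-proportion z-test (hypothetical helper).

    Returns the p-value for the hypothesis that the slice's error rate is
    higher than the error rate on the rest of the evaluation data.
    """
    rest_errors = total_errors - slice_errors
    rest_size = total_size - slice_size
    if slice_size <= 0 or rest_size <= 0:
        raise ValueError("slice and its complement must both be non-empty")
    p_slice = slice_errors / slice_size
    p_rest = rest_errors / rest_size
    pooled = total_errors / total_size
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / slice_size + 1.0 / rest_size))
    if std_err == 0.0:
        return 1.0  # no variation in the data, so nothing to flag
    z = (p_slice - p_rest) / std_err
    return 1.0 - NormalDist().cdf(z)


if __name__ == "__main__":
    # Illustrative counts: (errors in slice, slice size); 100 errors in 1000 rows overall.
    slices = {"country=XX": (30, 100), "device=YY": (10, 100)}
    for name, (errors, size) in slices.items():
        p = slice_p_value(errors, size, total_errors=100, total_size=1000)
        print(name, "flagged" if p < 0.01 else "ok", p)
```

In a real system the number of candidate slices is large, so a correction for multiple comparisons would also be needed; the sketch leaves that out.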

Considerations for meaningful sign language machine translation based on glosses

Mathias Müller , Zifan Jiang , Amit Moryossef , Annette Rios , Sarah Ebling

Categories: Natural Language Processing | Artificial Intelligence

2022-11-28

Automatic sign language processing is gaining popularity in Natural Language Processing (NLP) research (Yin et al., 2021). In machine translation (MT) in particular, sign language translation based on glosses is a prominent approach. In this paper, we review recent works on neural gloss translation. We find that limitations of glosses in general and limitations of specific datasets are not discussed in a transparent manner and that there is no common standard for evaluation. To address these issues, we put forward concrete recommendations for future research on gloss translation. Our suggestions advocate awareness of the inherent limitations of gloss-based approaches, realistic datasets, stronger baselines and convincing evaluation.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Categories: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Enhanced Low-resolution LiDAR-Camera Calibration Via Depth Interpolation and Supervised Contrastive Learning

Zhikang Zhang , Zifan Yu , Suya You , Raghuveer Rao , Sanjeev Agarwal , Fengbo Ren

Categories: Computer Vision

2022-11-08

Motivated by the increasing use of low-resolution LiDAR, we target the problem of low-resolution LiDAR-camera calibration in this work. The main challenges are twofold: sparsity and noise in point clouds. To address the problem, we propose to apply depth interpolation to increase the point density and supervised contrastive learning to learn noise-resistant features. Experiments on RELLIS-3D demonstrate that our approach achieves average mean absolute translation/rotation errors of 0.15 cm/0.33° on 32-channel LiDAR point cloud data, which significantly outperforms all reference methods.

Automatic Error Detection in Integrated Circuits Image Segmentation: A Data-driven Approach

Zhikang Zhang , Bruno Machado Trindade , Michael Green , Zifan Yu , Christopher Pawlowicz , Fengbo Ren

Categories: Computer Vision

2022-11-08

Due to the complicated nanoscale structures of current integrated circuit (IC) builds and the low error tolerance of IC image segmentation tasks, most existing automated IC image segmentation approaches require human experts for visual inspection to ensure correctness, which is one of the major bottlenecks in large-scale industrial applications. In this paper, we present the first data-driven automatic error detection approach targeting two types of IC segmentation errors: wire errors and via errors. On an IC image dataset collected from industry, we demonstrate that, by adapting existing CNN-based approaches to image classification and image translation with additional pre-processing and post-processing techniques, we are able to achieve recall/precision of 0.92/0.93 in wire error detection and 0.96/0.90 in via error detection, respectively.

Deep Generative Models on 3D Representations: A Survey

Zifan Shi , Sida Peng , Yinghao Xu , Yiyi Liao , Yujun Shen

Categories: Computer Vision

2022-10-27

Generative models, an important family of statistical models, aim to learn the observed data distribution by generating new instances. Along with the rise of neural networks, deep generative models, such as variational autoencoders (VAEs) and generative adversarial networks (GANs), have made tremendous progress in 2D image synthesis. Recently, researchers have shifted their attention from 2D space to 3D space, considering that 3D data better aligns with our physical world and hence has great potential in practice. However, unlike a 2D image, which by nature has an efficient representation (i.e., a pixel grid), representing 3D data poses far more challenges. Concretely, we would expect an ideal 3D representation to be capable enough to model shapes and appearances in detail, and to be efficient enough to model high-resolution data at high speed and low memory cost. However, existing 3D representations, such as point clouds, meshes, and recent neural fields, usually fail to meet these requirements simultaneously. In this survey, we thoroughly review the development of 3D generation, including 3D shape generation and 3D-aware image synthesis, from the perspectives of both algorithms and, more importantly, representations. We hope that our discussion can help the community track the evolution of this field and spark innovative ideas to advance this challenging task.

A Zeroth-Order Momentum Method for Risk-Averse Online Convex Games

Zifan Wang , Yi Shen , Zachary I. Bell , Scott Nivison , Michael M. Zavlanos , Karl H. Johansson

Categories: Machine Learning | Machine Learning (Statistics)

2022-09-06

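We consider risk-averse learning in repeated unknown games where the goal of the agents is to minimize their individual risk of incurring a high cost. Specifically, the agents use the conditional value at risk (CVaR) as a risk measure and rely on bandit feedback, in the form of the cost values of the selected actions at every episode, to estimate their CVaR values and update their actions. A major challenge in using bandit feedback to estimate CVaR is that the agents can only access their own cost values, which, however, depend on the actions of all agents. To address this challenge, we propose a new risk-averse learning algorithm with momentum that utilizes the full historical information on the cost values. We show that this algorithm achieves sub-linear regret and matches the best known algorithms in the literature. We provide numerical experiments for a Cournot game that show that our method outperforms existing methods.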
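
As a rough illustration of the risk measure named in the abstract, the sketch below estimates CVaR from a finite sample of costs as the mean of the worst (1 - alpha) fraction of them. This is a plain sample-based estimate, not the paper's zeroth-order estimator built from bandit feedback. The function name `empirical_cvar`, the level `alpha`, and the sample costs are illustrative assumptions.

```python
import math


def empirical_cvar(costs, alpha):
    """Mean of the worst (1 - alpha) fraction of sampled costs (hypothetical helper).

    costs: iterable of observed cost values (higher is worse)
    alpha: risk level in [0, 1); alpha = 0 reduces to the plain mean
    """
    ordered = sorted(costs, reverse=True)  # highest cost first
    if not ordered:
        raise ValueError("costs must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    # Keep at least one sample in the tail.
    tail_size = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    tail = ordered[:tail_size]
    return sum(tail) / len(tail)


if __name__ == "__main__":
    sample_costs = [1.0, 2.0, 3.0, 10.0]
    print(empirical_cvar(sample_costs, alpha=0.5))  # mean of the two largest costs
    print(empirical_cvar(sample_costs, alpha=0.0))  # plain mean of all costs
```

A risk-averse agent minimizing this quantity cares about its worst episodes, not its average cost, which is why the rare cost of 10.0 dominates the estimate at alpha = 0.5.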
